
    Deepfake detection: humans vs. machines

    Deepfake videos, where a person's face is automatically swapped with the face of someone else, are becoming easier to generate with more realistic results. In response to the threat such manipulations pose to our trust in video evidence, several large datasets of deepfake videos and many methods to detect them have been proposed recently. However, it is still unclear how realistic deepfake videos are for an average person and whether the algorithms are significantly better than humans at detecting them. In this paper, we present a subjective study conducted in a crowdsourcing-like scenario, which systematically evaluates how hard it is for humans to tell whether a video is a deepfake or not. For the evaluation, we used 120 different videos (60 deepfakes and 60 originals) manually pre-selected from the Facebook deepfake database, which was provided for Kaggle's Deepfake Detection Challenge 2020. For each video, a simple question, "Is the face of the person in the video real or fake?", was answered on average by 19 naïve subjects. The results of the subjective evaluation were compared with the performance of two different state-of-the-art deepfake detection methods, based on Xception and EfficientNet (B4 variant) neural networks, which were pre-trained on two other large public databases: the Google subset of FaceForensics++ and the recent Celeb-DF dataset. The evaluation demonstrates that while human perception is very different from machine perception, both are successfully fooled by deepfakes, albeit in different ways. Specifically, algorithms struggle to detect the deepfake videos that human subjects found very easy to spot.
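
    As a rough illustration of how such frame-based detectors are typically applied to whole videos, the sketch below scores sampled frames with a binary classifier and averages the per-frame probabilities into a video-level score; the model checkpoint name is hypothetical, and this is a simplification rather than the paper's exact pipeline.

        # Sketch: video-level deepfake score from a frame-level classifier.
        # Assumes a binary classifier (e.g., an Xception or EfficientNet-B4
        # backbone) already fine-tuned on deepfake data; the checkpoint path
        # below is hypothetical.
        import cv2
        import numpy as np
        import torch

        def score_video(model, path, every_nth=10, size=299, device="cpu"):
            """Average fake probability over every `every_nth` frame."""
            cap = cv2.VideoCapture(path)
            probs, idx = [], 0
            while True:
                ok, frame = cap.read()
                if not ok:
                    break
                if idx % every_nth == 0:
                    # BGR -> RGB, resize to the network input, scale to [0, 1].
                    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
                    rgb = cv2.resize(rgb, (size, size)).astype(np.float32) / 255.0
                    x = torch.from_numpy(rgb).permute(2, 0, 1).unsqueeze(0).to(device)
                    with torch.no_grad():
                        # Assumes the model outputs a single "fake" logit.
                        probs.append(torch.sigmoid(model(x)).item())
                idx += 1
            cap.release()
            return float(np.mean(probs))  # video score: mean frame probability

        # model = torch.load("deepfake_xception.pt").eval()  # hypothetical file
        # print("P(fake) =", score_video(model, "video.mp4"))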

    Video quality for video analysis

    Ph.D. thesis (Doctor of Philosophy)

    Towards optimal distortion-based visual privacy filters

    The widespread use of digital video surveillance systems has increased concerns about privacy violation. Since video surveillance systems are invasive, it is a challenge to find an acceptable balance between the privacy of the public under surveillance and the security-related features of the systems. Many privacy protection tools have been proposed, ranging from simple methods such as blurring or pixelization to more advanced ones such as scrambling and geometrical-transform-based filters. However, for a given filter implemented in a practical video surveillance system, it is necessary to know the strength with which the filter should be applied to protect privacy reliably. Assuming an automated surveillance system, this paper objectively investigates several privacy protection filters with varying strength degrees and determines the optimal strength values to achieve privacy protection. To this end, five privacy filters were applied to images from the FERET dataset and the performance of three recognition algorithms was evaluated. The results show that different privacy protection filters influence the accuracy of different face recognition algorithms differently, and this influence depends both on the robustness of the recognition algorithm and on the type of distortion filter.
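
    The two simplest filters mentioned above are straightforward to reproduce; the following minimal sketch implements blurring and pixelization with a tunable strength parameter (OpenCV; the strength values are illustrative, not the ones evaluated in the paper).

        # Sketch of two basic privacy filters with a tunable strength parameter.
        import cv2

        def blur_filter(img, strength):
            """Gaussian blur; a larger kernel means stronger obfuscation."""
            k = 2 * strength + 1                  # kernel size must be odd
            return cv2.GaussianBlur(img, (k, k), 0)

        def pixelize_filter(img, strength):
            """Pixelization: downsample by `strength`, then upsample back."""
            h, w = img.shape[:2]
            small = cv2.resize(img, (max(1, w // strength), max(1, h // strength)),
                               interpolation=cv2.INTER_LINEAR)
            return cv2.resize(small, (w, h), interpolation=cv2.INTER_NEAREST)

        # In a full experiment one would sweep strengths and feed each filtered
        # image to a face recognizer to find the weakest strength that reliably
        # defeats recognition.
        # img = cv2.imread("face.png")
        # for s in (2, 4, 8, 16):
        #     protected = pixelize_filter(img, s)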

    Joint Operation of Voice Biometrics and Presentation Attack Detection

    Research in the area of automatic speaker verification (ASV) has advanced enough for the industry to start using ASV systems in practical applications. However, as has also been shown for fingerprint, face, and other verification systems, ASV systems are highly vulnerable to spoofing or presentation attacks, limiting their wide practical deployment. Therefore, to protect against such attacks, effective anti-spoofing techniques, more formally known as presentation attack detection (PAD) systems, need to be developed. These techniques should then be seamlessly integrated into existing ASV systems for practical all-in-one solutions. In this paper, we focus on the integration of PAD and ASV systems. We consider state-of-the-art i-vector and ISV-based ASV systems and demonstrate the effect of score-based integration with a PAD system on verification and attack detection accuracy. In our experiments, we rely on the AVspoof database, which contains realistic presentation attacks that the industry considers a threat to practical ASV systems. Experimental results show a significantly increased resistance of the joint ASV-PAD system to attacks at the expense of slightly degraded performance in scenarios without spoofing attacks. An important additional contribution of the paper is open-source and online implementations of the separate and joint ASV-PAD systems.
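
    As an illustration of what score-based integration can look like, here is a minimal sketch of two common schemes, a cascading gate and a weighted sum; the thresholds and fusion weight are placeholders, not the values used in the paper.

        # Sketch of score-level ASV-PAD integration: a cascading gate and a
        # weighted-sum (parallel) scheme. Thresholds and the fusion weight are
        # placeholders; in practice they are tuned on development data.

        def cascade_decision(pad_score, asv_score, pad_thr=0.0, asv_thr=0.0):
            """Cascading scheme: PAD acts as a gate in front of the verifier."""
            if pad_score < pad_thr:      # sample looks like a presentation attack
                return False             # reject without consulting the ASV system
            return asv_score >= asv_thr  # otherwise the ASV system decides

        def fused_score(pad_score, asv_score, w=0.5):
            """Parallel scheme: one fused score, thresholded downstream."""
            return w * pad_score + (1.0 - w) * asv_score

        # trials = [(1.3, 0.7), (-2.1, 0.9)]  # (PAD score, ASV score) pairs
        # decisions = [cascade_decision(p, a) for p, a in trials]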

    Presentation attack detection in voice biometrics

    Recent years have seen an increase in both the accuracy of biometric systems and their practical use. The application of biometrics is becoming widespread, with fingerprint sensors in smartphones, automatic face recognition in social networks and video-based applications, and speaker recognition in phone banking and other phone-based services. The popularization of biometric systems, however, has exposed their major flaw: high vulnerability to spoofing attacks. A fingerprint sensor can be easily tricked with a simple glue-made mold, a face recognition system can be accessed using a printed photo, and a speaker recognition system can be spoofed with a replay of pre-recorded voice. The ease with which a biometric system can be spoofed demonstrates the importance of developing efficient anti-spoofing systems that can detect both known (conceivable now) and unknown (possible in the future) spoofing attacks. Therefore, it is important to develop mechanisms that can detect such attacks, and it is equally important for these mechanisms to be seamlessly integrated into existing biometric systems for practical and attack-resistant solutions. To be practical, an attack detection system should (i) have high accuracy, (ii) generalize well to different attacks, and (iii) be simple and efficient. One reason for the increasing demand for effective presentation attack detection (PAD) systems is the ease of access to people's biometric data. A potential attacker can often almost effortlessly obtain the necessary biometric samples from social networks, including facial images, audio and video recordings, and even fingerprints extracted from high-resolution images. Therefore, various privacy protection solutions, such as legal privacy requirements, algorithms for obfuscating personal information (e.g., visual privacy filters), and social awareness of threats to privacy, can also increase the security of personal information and potentially reduce the vulnerability of biometric systems. In this chapter, however, we focus on presentation attack detection in voice biometrics, i.e., automatic speaker verification (ASV) systems. We discuss the vulnerabilities of these systems to presentation attacks (PAs), present different state-of-the-art PAD systems, give insights into their performance, and discuss the integration of PAD and ASV systems.
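
    As a rough sketch of one classic family of voice PAD baselines (not necessarily the systems evaluated in this chapter), the code below scores an utterance with two Gaussian mixture models trained on cepstral features of bona fide and attack speech; the feature choice, coefficient count, and number of mixture components are all illustrative.

        # Sketch of a classic cepstral-feature PAD baseline: two GMMs, one
        # trained on bona fide speech and one on attacks, score each utterance,
        # and the log-likelihood ratio serves as the PAD score.
        import librosa
        import numpy as np
        from sklearn.mixture import GaussianMixture

        def cepstral_features(wav_path, n_mfcc=20):
            y, sr = librosa.load(wav_path, sr=16000)
            return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T  # frames x coeffs

        def train_pad(bonafide_paths, attack_paths, n_components=64):
            gmm_bona = GaussianMixture(n_components).fit(
                np.vstack([cepstral_features(p) for p in bonafide_paths]))
            gmm_attack = GaussianMixture(n_components).fit(
                np.vstack([cepstral_features(p) for p in attack_paths]))
            return gmm_bona, gmm_attack

        def pad_score(gmm_bona, gmm_attack, wav_path):
            feats = cepstral_features(wav_path)
            # Mean per-frame log-likelihood ratio: positive leans bona fide.
            return gmm_bona.score(feats) - gmm_attack.score(feats)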

    Using Warping for Privacy Protection in Video Surveillance

    The widespread use of digital video surveillance systems has increased concerns about the violation of privacy rights. Since video surveillance systems are invasive, it is a challenge to find an acceptable balance between the privacy of the public under surveillance and the functionalities of the systems. Tools for the protection of visual privacy available today lack some or all of the important properties, such as security of the protected visual data, reversibility (the ability to undo privacy protection), simplicity, and independence from the video encoding used. In this paper, we propose an algorithm based on well-known warping techniques (common for animation and artistic purposes) to obfuscate faces in video surveillance, aiming to overcome these shortcomings. To demonstrate the feasibility of the approach, we apply the warping algorithm to faces in the standard Yale dataset and run face detection and recognition algorithms on the resulting images. Experiments demonstrate the trade-off between warping strength and accuracy for both detection and recognition.
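
    To illustrate the general idea (not the paper's exact transform), the sketch below warps an image with a smooth pseudo-random displacement field; the strength and grid parameters are illustrative, and keeping the random seed secret hints at how reversibility could be arranged.

        # Sketch of a warping-based obfuscation filter: a smooth pseudo-random
        # displacement field remaps pixels in the face region.
        import cv2
        import numpy as np

        def warp_face(img, strength=8.0, grid=8, seed=0):
            h, w = img.shape[:2]
            rng = np.random.default_rng(seed)   # the seed acts as a secret key
            # Coarse random displacements, upsampled to a smooth per-pixel field.
            dx = cv2.resize(rng.uniform(-strength, strength,
                                        (grid, grid)).astype(np.float32), (w, h))
            dy = cv2.resize(rng.uniform(-strength, strength,
                                        (grid, grid)).astype(np.float32), (w, h))
            xs, ys = np.meshgrid(np.arange(w), np.arange(h))
            map_x = (xs + dx).astype(np.float32)
            map_y = (ys + dy).astype(np.float32)
            return cv2.remap(img, map_x, map_y, interpolation=cv2.INTER_LINEAR)

        # A larger `strength` distorts the face more, trading recognition
        # accuracy for privacy, mirroring the trade-off studied in the paper.
        # face = cv2.imread("face.png"); protected = warp_face(face, strength=12)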

    Impact of score fusion on voice biometrics and presentation attack detection in cross-database evaluations

    Research in the area of automatic speaker verification (ASV) has advanced enough for the industry to start using ASV systems in practical applications. However, these systems are highly vulnerable to spoofing or presentation attacks, limiting their wide deployment. Therefore, it is important to develop mechanisms that can detect such attacks, and it is equally important for these mechanisms to be seamlessly integrated into existing ASV systems for practical and attack-resistant solutions. To be practical, an attack detection system should (i) have high accuracy, (ii) generalize well to different attacks, and (iii) be simple and efficient. Several audio-based presentation attack detection (PAD) methods have been proposed recently, but they were usually evaluated on a single, often obscure, database with a limited number of attacks. Therefore, in this paper, we conduct an extensive study of eight state-of-the-art PAD methods and evaluate their ability to detect known and unknown attacks (e.g., in a cross-database scenario) using two major publicly available speaker databases with spoofing attacks: AVspoof and ASVspoof. We investigate whether combining several PAD systems via score fusion can improve attack detection accuracy. We also study the impact of fusing PAD systems (via parallel and cascading schemes) with two ASV systems, based on i-vectors and inter-session variability modeling, on the overall performance in both bona fide (no attacks) and spoof scenarios. The evaluation results question the efficiency and practicality of the existing PAD systems, especially when comparing results for individual databases and cross-database data. Fusing several PAD systems can lead to slightly improved performance; however, how to select which systems to fuse remains an open question. Joint ASV-PAD systems show a significantly increased resistance to attacks at the expense of slightly degraded performance in bona fide scenarios.
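
    One common way to realize such score fusion is a learned linear combination of the individual system scores; the sketch below uses logistic regression on placeholder development scores and is only one possible instantiation, not necessarily the fusion used in the paper.

        # Sketch: fuse several PAD systems by logistic regression over their
        # scores. The score matrices below are random placeholders standing in
        # for real development/evaluation scores.
        import numpy as np
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(0)
        dev_scores = rng.normal(size=(1000, 8))  # 1000 trials x 8 PAD systems
        dev_labels = rng.integers(0, 2, 1000)    # 1 = bona fide, 0 = attack

        # Learn a weighted, calibrated combination on development data.
        fuser = LogisticRegression().fit(dev_scores, dev_labels)

        # The fused score is the calibrated log-odds of the bona fide class.
        eval_scores = rng.normal(size=(200, 8))
        fused = fuser.decision_function(eval_scores)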

    UHD Video Dataset for Evaluation of Privacy

    Ultra High Definition (UHD) is one of the emerging immersive video technologies already available to the public, as even some smartphones are capable of capturing UHD video. The increasingly widespread availability of UHD-capable recording devices has important implications for privacy. This paper addresses the problem by proposing a publicly available UHD video dataset designed for the evaluation of privacy issues. The dataset depicts typical surveillance scenarios of people fighting, exchanging bags, walking, and stealing, in indoor and outdoor environments. The dataset also includes data from a subjective assessment, which evaluated the impact of UHD on privacy compared to the currently common High Definition (HD) video and the declining Standard Definition (SD) video. The results of the assessment not only demonstrate that UHD is a significantly more privacy-intrusive technology than the HD and SD used today, but also quantify the degree of that intrusiveness.